Abstract

A secure reliable multicast protocol enables a process to send a message to a group of recipients such 
that all correct destinations receive the same message, despite the malicious efforts of fewer than a 
third of the total number of processes, including the sender. This has been shown to be a useful tool 
in building secure distributed services, albeit with a cost that typically grows linearly with the size 
of the system. For very large networks, for which this is prohibitive, we present two approaches for 
reducing the cost: First, we show a protocol whose cost is on the order of the number of tolerated 
failures. Secondly, we show how relaxing the consistency requirement to a probabilistic guarantee 
can reduce the associated cost, effectively to a constant. 

1 Introduction 

Communication over a large and sparse internet is a challenging problem because communication 
links experience diverse delays and failures. Moreover, in a wide area network (WAN), security is 
most crucial, since communicating parties are geographically dispersed and thus are more prone 
to attacks. The problem addressed in this paper is the secure reliable multicast problem, namely, 
how to distribute messages among a large group of participants so that all the (correctly behaving) 
participants agree on messages' contents, despite the malicious cooperation of up to a third of the 
members. 

Experience with building robust distributed systems shows that (secure) reliable multicast is
an important tool for distributed applications. Distributed platforms can increase the efficiency
of services and diminish the trust put in each component. For example, the Omega key management
system [19] provides key backup, recovery and other functions in a penetration-tolerant way
using the Rampart distributed communication infrastructure [18]. In such a service and others,
distribution might increase the sensitivity to failures and malicious attacks. To address issues of
availability and security, distributed services must rely on mechanisms for maintaining consistent
intermediate state and for making coordinated decisions. Reliable multicast underlies the mechanisms
used in many infrastructure tools supporting such distributed systems (for a representative
collection, cf. [?]).

Previous work on the reliable multicast problem suffers from message complexity and computation
costs that do not scale to very large communication groups: Toueg's echo_broadcast [22, ?]
requires $O(n^2)$ authenticated message exchanges for each message delivery (where n is the size
of the group). Reiter improved this message complexity in the ECHO protocol of the Rampart
system [17] through the use of digital signatures. The ECHO protocol incurs $O(n)$ signed message
exchanges, and thus message complexity is improved at the expense of increased computation
cost. Malkhi and Reiter [11] extended this approach to amortize the cost of computing digital
signatures over multiple messages through a technique called acknowledgment chaining, where a
signed acknowledgment directly verifies the message it acknowledges and indirectly verifies every
message that message acknowledges. Nevertheless, $O(n)$ digital signatures remain in the critical
path between the sending of a message and its delivery. For a very large group of hundreds or
thousands of members, this may be prohibitive.

In this paper, we propose two approaches for reducing the cost and delay associated with
reliable multicast. First, we show a protocol whose cost is on the order of the number of tolerated
failures, rather than of the group size. Secondly, we show how relaxing the consistency requirement
to a selected probabilistic guarantee can reduce the associated cost to a constant. The principle
underlying agreement on message contents in previous works, as well as ours, is the following (see
Figure 1): For a process p to send a message m to the group of processes, signed validations
are obtained for m from a certain set of processes, thereby enabling delivery of m. We call this
validation set the witness set of m, denoted witness(m). Witness sets are chosen so that any
pair of them intersect at a correct process, and such that some witness set is always accessible
despite failures. More precisely, witness sets satisfy the Consistency and Availability requirements
of Byzantine dissemination quorum systems (cf. [12]), as follows:



Definition 1.1 A dissemination quorum system is a set of subsets, called quorums, satisfying:
For every set B of faulty processes, and every two quorums $Q_1, Q_2$: $Q_1 \cap Q_2 \not\subseteq B$ (Consistency).
For every set B of faulty processes, there exists a quorum Q such that $Q \cap B = \emptyset$ (Availability).

Using dissemination quorums as witness sets for messages, if a faulty process generates two 
messages m, m' with the same sender and sequence number and different contents (called conflicting 
messages), the corresponding witness sets intersect at a correct process (by Consistency). Thus, 
they cannot both obtain validations, and at most one of them will be delivered by the correct 
processes of the system. Further, dissemination quorums ensure availability, such that a correct 
process can always obtain validation from a witness set despite possible failures. For a resilience
threshold $t \le \lfloor (n-1)/3 \rfloor$, previous works used quorums of size $\lceil (n+t+1)/2 \rceil$, providing both
consistency and availability.
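
For small systems, the two requirements of Definition 1.1 can be checked exhaustively. The following Python sketch is an illustration only: it assumes that the quorums are all subsets of a fixed size q (a threshold construction) and that fault sets have exactly t members; the function name is hypothetical.

```python
from itertools import combinations
from math import ceil


def is_dissemination_quorum_system(n: int, t: int, q: int) -> bool:
    """Brute-force check of Consistency and Availability when the quorums
    are all q-element subsets of n processes and fault sets have t members."""
    processes = range(n)
    quorums = [frozenset(c) for c in combinations(processes, q)]
    fault_sets = [frozenset(c) for c in combinations(processes, t)]
    # Consistency: no intersection of two quorums lies entirely inside a fault set.
    consistent = all(
        not (q1 & q2) <= b
        for q1 in quorums
        for q2 in quorums
        for b in fault_sets
    )
    # Availability: for every fault set, some quorum avoids it completely.
    available = all(any(not (qm & b) for qm in quorums) for b in fault_sets)
    return consistent and available


n, t = 7, 2
q = ceil((n + t + 1) / 2)  # quorum size used by previous works
print(q, is_dissemination_quorum_system(n, t, q))
```

With n = 7 and t = 2, a smaller quorum size fails Consistency and a larger one fails Availability, so the size used by previous works is the only one that passes in this instance.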

Our first improvement in the 3T protocol drops the quorum size from $\lceil (n+t+1)/2 \rceil$ to 2t + 1.
When t is a small constant, this improvement is substantial, since we need only wait for $O(t)$
processes, no matter how big the WAN might be. Briefly, the trick in bringing down the size of the
witness sets is in designating for every message m a witness set W_3T(m) of size 3t + 1, determined
by its sender and sequence number. A message must get validations from 2t + 1 processes, out of
the designated set of 3t + 1, in order to be delivered.

Figure 1: Framework of Secure Reliable Multicast Protocols

Our second improvement stems from relaxing the requirement on processes to (always) agree
on messages' contents to a probabilistic requirement. This leads to a protocol that consists of
two regimes: The first one, called the no-failure regime, is applied in faultless scenarios. It is
very efficient, incurring only a constant overhead in message exchanges and signature computations.
The second regime is the recovery regime, and is resorted to in case of failures. The two regimes
inter-operate by having the witnesses of the no-failure regime actively probe the system to detect
conflicting messages. We introduce a probabilistic protocol combining both regimes, active_t, whose
properties are as follows:

• Given a resilience threshold t, active_t can be tuned to guarantee agreement on message
contents by all the correct processes on all but an arbitrarily small expected fraction ε of the
messages. Those messages that might be subject to conflicting delivery are determined by the
random choices made by the processes after the execution starts, and hence a non-adaptive
adversary cannot affect their choice.

• The overhead of forming agreement on message contents in active_t in faultless circumstances
is determined by two constants that depend on ε only (and not on the system size or t).

In this paper, we assume a static set of communicating processes. It is possible, however, to
use known techniques (e.g., in the group communication context one can use [17]) to extend our
protocols to operate in a dynamic environment in which processes may leave or join the set of
destination processes and in which processes may fail and recover.

The rest of this paper is organized as follows: In Section 2 we formally present our assumptions
about the system. Section 3 presents a precise problem definition, and demonstrates feasibility
through a simple solution. Section 4 contains the 3T protocol description, and Section 5 contains
the active_t protocol description. We then analyze the load induced on processes participating
in our protocols, and conclude.

2 Model 

The system contains n participating processes, denoted $P = \{p_1, p_2, \ldots, p_n\}$, up to $t \le \lfloor (n-1)/3 \rfloor$
of which may be arbitrarily (Byzantine) faulty. A faulty process may deviate from the behavior
dictated by the protocol in an arbitrary way, subject to cryptographic assumptions. (Such failures
are called authenticated Byzantine failures.)

Processes interact solely through message passing. Every pair of correct processes is connected
via an authenticated FIFO channel that guarantees the identity of senders using any one of several
well-known cryptographic techniques. We assume neither a limit on the relative speeds of different
processes nor a known upper bound on message transmission delays. However, we assume that every
message sent between two processes has a known probability of reaching its destination, which grows to
one as the elapsed time from sending increases. The last property is needed only in the active_t
protocol, and ensures that the delays can be set in the protocol to guarantee that fault notification
reaches all correct processes. In practice, this can be realized using quality-guaranteed out-of-band
communication for control messages.

A system that admits Byzantine failures is often abstracted as having an adversary working
against the successful execution of protocols. We assume the following limitations on the adversary's
powers: The adversary chooses which processes are faulty at the beginning of the execution, and
thus its choice is non-adaptive. Every process possesses a private key, known only to itself, that
may be used for signing data using a known public key cryptographic method (such as [21]). Let
d be any data block. We denote by d_{K_i} the signature of p_i on the data d by means of p_i's private
key. We assume that every process in the system may obtain the public keys of all of the other
processes, such that it can verify the authenticity of signatures. The adversary cannot access the
local memories of the correct processes or the communication among them, nor break their private
keys, and thus cannot obtain data internal to computations in the protocols.

Our protocols also make use of a cryptographically secure hash function H (such as MD5 [20]).
Our assumption is that it is computationally infeasible for the adversary to find two different
messages m and m' such that H(m) = H(m').

3 The Problem Definition and a Basic Solution 

A reliable multicast protocol provides each process p with two operations:

WAN-multicast(m): p sends the multicast message m to the group.

WAN-deliver(m): p delivers a multicast message m, making it available to applications at p.

For convenience, we assume that a multicast message m contains several fields:

sender(m): The identity of the sending process.

seq(m): A count of the multicast messages originated by sender(m).

payload(m): The (opaque) data of the message.



The protocol should guarantee that all of the correct members of the group agree on the delivered 
messages, and furthermore, that messages sent by correct processes are (eventually) delivered by 
all of the correct processes. More precisely, the protocol should maintain the following properties: 

Integrity: Let p be a correct process. Then for any sequence number s, p performs
WAN-deliver(m) for a message m with seq(m) = s at most once, and if sender(m) is correct,
then only if sender(m) executed WAN-multicast(m).

Self-delivery: Let p be a correct process. If p executes WAN-multicast(m) then eventually p
delivers m, i.e., eventually p executes WAN-deliver(m).

Reliability: Let p_i and p_j be two correct processes. If p_i delivers a message from p_k with sequence
number seq (via WAN-deliver(m)), then (eventually) p_j delivers a message from p_k with
sequence number seq.

(Probabilistic) Agreement: Let p_i and p_j be two correct processes. Let p_i deliver a message
m, p_j deliver m', such that sender(m) = sender(m') and seq(m) = seq(m'). Then (with very
high probability) p_i and p_j delivered the same message, i.e., payload(m) = payload(m').



The problem statement above is strictly weaker than the Byzantine agreement problem [10],
which is known to be unsolvable in asynchronous systems [?]. This statement holds even if we use
the unconditional Agreement requirement. The reason is that only messages from correct processes
are required to be delivered, and thus messages from faulty processes can "hang" forever. Note
that there is no ordering requirement among different messages, and thus the problem statement
is weaker than the totally ordered reliable multicast problem, which can be solved only
probabilistically [13, 14]. The reliable multicast problem is solvable in our environment, as is demonstrated
by the E protocol depicted in Figure 2 (which borrows from the Rampart ECHO multicast protocol
[17]). Throughout the protocol, each process p_i maintains a delivery vector delivery_i containing
the sequence number of the last WAN-delivered message from every other process. Every entry of
delivery_i is initially set to zero.

This protocol assumes the presence of a stability mechanism (SM), utilized by the processes,
that allows each process to learn when a message has been delivered by other processes, for purposes
of re-transmission and garbage collection. The details of such a mechanism are omitted (for one
such mechanism, in the context of a group communication system, see e.g. [?]). However,
we note that by properly tuning timeout periods and by packing multiple messages together (e.g.,
by piggybacking on regular traffic), the cost of such a mechanism is negligible in practice. The
mechanism must assure the following properties:

SM_Reliability: Let p_i and p_j be two correct processes. If p_i performs WAN-deliver(m), then
eventually p_j knows that p_i performed WAN-deliver(m).

SM_Integrity: Let p_i and p_j be two correct processes. If p_j learns from the stability mechanism
that p_i performed WAN-deliver(m), then indeed, p_i performed WAN-deliver(m).






Protocols can be used as components in more complex protocols; to separate the messages of
disparate protocols, each message contains an initial field indicating to which protocol it belongs.
Messages within each protocol similarly contain fields indicating their role in the protocol (e.g., as
acknowledgements).



1. For a process p_i to WAN-multicast a message m (such that sender(m) = p_i), where p_i has
previously sent messages up to sequence number seq(m) - 1, process p_i sends

<E, regular, p_i, seq(m), H(m)>

to every process in P, and waits to obtain A = {<E, ack, p_i, seq(m), H(m)>_{K_j} | p_j ∈ P'}, a
set of signed acknowledgments from any set P' of $\lceil (n+t+1)/2 \rceil$ distinct processes. It then
sends the following to every process in P:

<E, deliver, m, A>.

2. When p_i receives a message <E, regular, p_j, cnt, h> from p_j, and no conflicting message was
previously received from p_j, then p_i sends back to p_j a signed acknowledgment

<E, ack, p_j, cnt, h>_{K_i}.

3. When p_i receives a message <E, deliver, m, A>, such that A contains a valid set of
acknowledgments for <sender(m), seq(m), H(m)> (acknowledgements for m from $\lceil (n+t+1)/2 \rceil$ distinct
processes) and such that delivery_i[sender(m)] = seq(m) - 1, p_i performs WAN-deliver(m) and
sets delivery_i[sender(m)] to seq(m). If a timeout period has passed and p_j is not known to
have delivered m, p_i sends <E, deliver, m, A> to p_j.



Figure 2: The E protocol 
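
The delivery rule in step 3 of Figure 2 can be sketched in Python as follows. This is a minimal illustration, not the protocol implementation: signature verification is assumed to have happened already (an `Ack` record stands for a validly signed acknowledgment), SHA-256 is used as a stand-in for the hash function H, and the names `Message`, `Ack`, `can_deliver` and `deliver` are hypothetical.

```python
import hashlib
from dataclasses import dataclass
from math import ceil


@dataclass(frozen=True)
class Message:
    sender: int
    seq: int
    payload: bytes


@dataclass(frozen=True)
class Ack:
    signer: int   # process whose (assumed valid) signature covers the fields below
    sender: int
    seq: int
    digest: str


def H(m: Message) -> str:
    """Stand-in for the hash function H (SHA-256 is an assumption of this sketch)."""
    return hashlib.sha256(f"{m.sender}|{m.seq}|".encode() + m.payload).hexdigest()


def can_deliver(m: Message, acks: list, delivery: dict, n: int, t: int) -> bool:
    """Step 3: enough distinct acknowledgments for m, and m is next in sequence."""
    needed = ceil((n + t + 1) / 2)
    wanted = (m.sender, m.seq, H(m))
    signers = {a.signer for a in acks if (a.sender, a.seq, a.digest) == wanted}
    in_order = delivery.get(m.sender, 0) == m.seq - 1
    return len(signers) >= needed and in_order


def deliver(m: Message, acks: list, delivery: dict, n: int, t: int) -> bool:
    """Perform WAN-deliver(m) if allowed, updating the delivery vector."""
    if not can_deliver(m, acks, delivery, n, t):
        return False
    delivery[m.sender] = m.seq
    return True


m = Message(sender=0, seq=1, payload=b"hello")
acks = [Ack(signer=j, sender=0, seq=1, digest=H(m)) for j in (1, 2, 3)]
delivery = {}
print(deliver(m, acks, delivery, n=4, t=1), delivery)
```

Because the delivery vector advances on delivery, a second `<E, deliver, m, A>` for the same message is rejected, which is how duplicate deliveries are suppressed in this sketch.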

The E protocol ensures secure reliable multicast. However, it is inefficient in faultless runs,
incurring an overhead (in addition to $O(n)$ transmissions for multicast) of $O(n)$ signatures and
message exchanges per delivery; this might be an intolerable overhead for very large groups. We
shall improve it in the next section.

We now proceed to verify that the E protocol satisfies Integrity, Self-delivery, Reliability and 
Agreement. This is the basic proof in this paper; while simple and straightforward, it facilitates 
later proofs of optimizations. 

Definition 3.1 Two acknowledgements, <E, ack, p_k, cnt_k, h_k>_{K_i} and <E, ack, p_l, cnt_l, h_l>_{K_j},
conflict if p_k = p_l and cnt_k = cnt_l, but h_k ≠ h_l.

In proving the security of the E protocol, we will make use of the fact that in any run of the 
protocol, no correct process multicasts conflicting messages and no correct process signs conflicting 
acknowledgements. In addition, the lemma below states several properties that relate acknowledge- 
ment sets to the corresponding message transmissions: 

Lemma 3.1 In any run of the E protocol, the following hold:

1. If process p is correct in a run, then a correct process signs an acknowledgement for a message
m with sender(m) = p only if p multicasts message m.

2. If process p is correct in a run, and the run contains a set of valid E acknowledgements for a
message m with sender(m) = p, then p performed WAN-multicast(m) in the run.

3. The run does not contain two valid sets of conflicting E acknowledgements.

Proof: A correct process q signs an acknowledgement for m by a correct sender only when it
receives m over an authenticated channel from sender(m). The first property then follows from
the fact that a signed acknowledgement from q of the form <E, ack, p, cnt, H(m)>_{K_q} contains the
identity of the sender p = sender(m).

To see that the second property holds, recall that a valid acknowledgement set contains
acknowledgements from $\lceil (n+t+1)/2 \rceil$ distinct processes, which is at least t + 1 processes.
Hence, a valid set of acknowledgements must contain an acknowledgement signed by a
correct process. Since p is correct, by property (1) of the lemma, m was multicast by p.

Finally, note that two sets of valid acknowledgements must intersect in at least one correct
process. Since a correct process never signs conflicting acknowledgements, the third property
follows. □
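
The intersection claim used for the third property follows from a short counting argument, spelled out here for completeness. The symbol q for the size of a valid acknowledgement set is introduced only for this derivation.

```latex
% Let A and A' be two valid acknowledgement sets, each signed by
% q = \lceil (n+t+1)/2 \rceil distinct processes out of n.
\begin{align*}
  |A \cap A'| &\ge |A| + |A'| - n \\
              &= 2\left\lceil \frac{n+t+1}{2} \right\rceil - n \\
              &\ge (n + t + 1) - n \\
              &= t + 1 .
\end{align*}
% At most t processes are faulty, so at least one process that signed
% in both A and A' is correct.
```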

Theorem 3.2 (Integrity) Let p_i be a correct process participating in the E protocol. Then for any
message m, p_i performs WAN-deliver(m) at most once, and if sender(m) is correct, then only if
sender(m) executed WAN-multicast(m).

Proof: That p_i suppresses duplicate deliveries is immediate from the protocol. It is left to show
that if sender(m) is correct, it must have sent m. To prove this fact, consider that for p_i to deliver
m, p_i must have obtained a set A of valid acknowledgments for m. By Lemma 3.1(2), it follows
that m must have been sent by sender(m). □

Theorem 3.3 (Self-delivery) Let p_i be a correct process participating in the E protocol. If p_i
executes WAN-multicast(m) then eventually p_i executes WAN-deliver(m).

Proof: Notice that $\lceil (n+t+1)/2 \rceil \le n - t$, thus there are at least $\lceil (n+t+1)/2 \rceil$ correct
processes in P. Thus, if p_i sends <E, regular, p_i, seq(m), H(m)> to every process in P, then at least
$\lceil (n+t+1)/2 \rceil$ correct processes p_j will receive it. Since no correct process receives a conflicting
message from p_i, each will acknowledge it, sending back a <E, ack, p_i, seq(m), H(m)>_{K_j} message to
p_i, thereby enabling delivery of m by p_i. □

Theorem 3.4 (Reliability) Let p_i and p_j be two correct processes participating in the E protocol.
If p_i performs WAN-deliver(m), then eventually p_j performs WAN-deliver(m).

Proof: For p_i to deliver m, p_i must have obtained a set A of valid acknowledgments for m
from $\lceil (n+t+1)/2 \rceil$ processes. If p_i learns that p_j delivered m, then by SM_Integrity we are
done. Alternatively, if p_i does not learn that p_j delivered m, then after a timeout period p_i
sends <E, deliver, m, A> to p_j. By Lemma 3.1(3), p_j cannot have received a conflicting set of
acknowledgements, so at the latest, upon receipt of <E, deliver, m, A> from p_i, process p_j performs
WAN-deliver(m). □



Theorem 3.5 (Agreement) Let p_i and p_j be two correct processes participating in the E protocol.
Let p_i deliver a message m, p_j deliver m', such that sender(m) = sender(m') and seq(m) = seq(m').
Then p_i and p_j delivered the same message, i.e., payload(m) = payload(m').

Proof: For p_i to deliver m, p_i must have obtained a set A of valid acknowledgments for m from
$\lceil (n+t+1)/2 \rceil$ processes in P. Likewise, p_j must have obtained a set A' of $\lceil (n+t+1)/2 \rceil$ valid
acknowledgments for m'. By Lemma 3.1(3), A and A' do not conflict, and by the security of H,
m = m'. □



4 The 3T Protocol 

In this section we introduce the 3T protocol. The improvement in this protocol over the E protocol
above comes from designating a potential witness set for each message m based on the pair
<sender(m), seq(m)>. The choice of potential witness set for p_i's k-th message is determined
by a function W_3T(p_i, k), whose range is the set of subsets of exactly 3t + 1 distinct process ids.
For simplicity, we denote W_3T(m) = W_3T(sender(m), seq(m)). For efficiency, W_3T could be chosen
to distribute the load of witnessing over distinct sets of processes for different messages. Figure 3
provides the details of the 3T protocol.

For each message m the 3T protocol uses a witness set of size 2t + 1 out of a potential witness
set, W_3T(m), of 3t + 1 processes. The choice of the 2t + 1 threshold is significant, since it guarantees
that a majority of the correct members of W_3T(m) acknowledge the message m and thus no two
conflicting messages can receive the required threshold. As fewer than a third of W_3T(m) could be
faulty, 3T ensures Integrity, Reliability, Self-delivery and Agreement as in the E protocol above.
This protocol is used for failure-recovery in the active_t protocol below, in order to guarantee
Self-delivery.

The overhead incurred (in faultless runs, and not measuring the Stability Mechanism) is 2t + 1
signature generations and message exchanges per delivery.
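
One possible choice of the witness function is sketched below in Python. The text only requires that the function map a sender and sequence number to 3t + 1 distinct process ids; the rotation scheme here, and the names `witness_set_3t` and `min_overlap`, are assumptions made purely for illustration.

```python
def witness_set_3t(sender: int, seq: int, n: int, t: int) -> list[int]:
    """Hypothetical W_3T: 3t + 1 consecutive process ids (mod n), rotated by
    sender and sequence number so that witnessing load is spread over P."""
    size = 3 * t + 1
    if size > n:
        raise ValueError("need n >= 3t + 1")
    start = (sender + seq * size) % n
    return [(start + i) % n for i in range(size)]


def min_overlap(t: int) -> int:
    """Smallest intersection of two (2t + 1)-subsets of a (3t + 1)-set."""
    return 2 * (2 * t + 1) - (3 * t + 1)


n, t = 10, 2
print(witness_set_3t(sender=3, seq=1, n=n, t=t))  # potential witnesses of p_3's first message
print(min_overlap(t))                             # t + 1, so some shared witness is correct
```

Any two sets of 2t + 1 validators drawn from the same potential witness set share at least t + 1 members, which is why conflicting messages cannot both reach the threshold.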



1. For a process p_i to perform WAN-multicast(m) (such that sender(m) = p_i), where p_i has
previously sent messages up to sequence number seq(m) - 1, p_i sends

<3T, regular, p_i, seq(m), H(m)>

to every process in W_3T(m), and waits to obtain A = {<3T, ack, p_i, seq(m), H(m)>_{K_j} | p_j ∈ P'},
a set of signed acknowledgments from any subset P' of W_3T(m) comprising 2t + 1
distinct processes. Process p_i then sends to every process in P

<3T, deliver, m, A>.

2. When p_i receives a message <3T, regular, p_j, cnt, h> from p_j, such that no conflicting message
was previously received, p_i sends to p_j a signed acknowledgment

<3T, ack, p_j, cnt, h>_{K_i}.

3. When p_i receives a message <3T, deliver, m, A>, such that A contains valid signatures for
m from 2t + 1 members in W_3T(m), and such that delivery_i[sender(m)] = seq(m) - 1, p_i
performs WAN-deliver(m) and sets delivery_i[sender(m)] to seq(m). If a timeout period has
passed and p_j is not known to have delivered m, p_i sends <3T, deliver, m, A> to p_j.



Figure 3: The 3T protocol



5 The active_t Protocol

In this section, we relax the requirements and provide a protocol that guarantees (only) Probabilistic
Agreement. Thus, we allow the possibility that a small fraction of the delivered messages may
conflict.

The idea of the active_t protocol is as follows: We make use of a uniformly distributed function R
from input pairs of the form <sender(m), seq(m)> designating witness sets W_active(m) of k processes
in P. The function R is determined at set-up time, e.g., by seeding it with some random value that
processes choose collectively. By our assumptions, this means that R is unknown to the adversary
in advance, and so the choice of which processes are faulty is made without knowledge of R. The
size k of W_active is set so that only an exponentially small fraction of the messages can have a
witness set that contains only faulty processes (who may be collaborating with the sender). If only
$t \le \lfloor (n-1)/3 \rfloor$ members are faulty, then by the uniform distribution of R, the expected fraction
of messages with a 'faulty' witness set is $(t/n)^k < (1/3)^k$. Relatively small values of k are sufficient
for this to be negligible. (As with the size of cryptographic keys, k is effectively a constant.) This
idea of forming distributed trust in a cooperation-resilient way borrows from the time-stamping
mechanism of Haber et al. Moreover, we stipulate that correct processes multicast messages
in sequence order, and enforce this ordering on message delivery. This prevents a malicious sender
from scanning off-line the domain of <sender, seq> pairs for ones that have faulty witness sets and
sending only those messages.
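
To give a feel for how small the witness set can be, the following Python sketch computes the smallest k for which the bound (1/3)^k falls below a target fraction. It assumes, as in the text, that fewer than a third of the processes are faulty; the function names are hypothetical.

```python
def faulty_witness_fraction(n: int, t: int, k: int) -> float:
    """Expected fraction of messages whose k witnesses are all faulty, (t/n)^k."""
    return (t / n) ** k


def min_witnesses(epsilon: float, ratio: float = 1 / 3) -> int:
    """Smallest k with ratio**k <= epsilon (ratio bounds t/n from above)."""
    k = 0
    while ratio ** k > epsilon:
        k += 1
    return k


k = min_witnesses(1e-6)
print(k)                                      # witnesses needed for a one-in-a-million bound
print(faulty_witness_fraction(1000, 333, k))  # actual bound for n = 1000, t = 333
```

The result does not depend on n, which is the sense in which k is effectively a constant.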

If R is invertible (an input pair can be easily computed from a desired output, i.e., a specific,
presumably faulty, witness set) then after R is set, the adversary can compute which are the few
messages it will be able to corrupt. The security of the active_t protocol can be enhanced by the
use of a public random oracle R, that maps <sender(m), seq(m)> onto $2^P$, such that the output
cannot be distinguished from a random stream. It is assumed that the oracle can be accessed by
all processes and responds to all queries with one (randomly chosen) mapping. Its randomness
implies that the adversary cannot find inputs that map to faulty process sets. In practice,
one adopts the random oracle methodology [?] to approximate R, e.g., one uses a hash function (such
as MD5 [20]) in place of R, seeded with some input which is determined at set-up time, e.g., by
letting the processes collectively choose a random seed. Note again that by our assumptions, this
implies that the adversary selects which processes are faulty without knowledge of R. Although
this is widely done, we caution the reader that approximating R by a hash function has no proven
security guarantees, and is only heuristically secure in practice [?].
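
A hash-based approximation of R might look like the following Python sketch. It is an assumption-laden illustration: SHA-256 replaces the MD5 function mentioned in the text, the seed is taken as given, and the counter-based derivation of k distinct ids (as well as the name `active_witness_set`) is hypothetical. As cautioned above, such a construction carries only heuristic security.

```python
import hashlib


def active_witness_set(seed: bytes, sender: int, seq: int, n: int, k: int) -> list[int]:
    """Derive k distinct process ids in range(n) from (seed, sender, seq).

    Each candidate comes from hashing the inputs together with a counter;
    repeated candidates are skipped until k distinct ids are collected.
    """
    if k > n:
        raise ValueError("cannot choose more witnesses than processes")
    chosen: list[int] = []
    counter = 0
    while len(chosen) < k:
        data = seed + f"|{sender}|{seq}|{counter}".encode()
        digest = hashlib.sha256(data).digest()
        # Reducing modulo n introduces a slight bias, ignored in this sketch.
        candidate = int.from_bytes(digest[:8], "big") % n
        if candidate not in chosen:
            chosen.append(candidate)
        counter += 1
    return chosen


witnesses = active_witness_set(b"collectively-chosen-seed", sender=3, seq=7, n=100, k=10)
print(witnesses)
```

Every process that knows the seed computes the same witness set for a given message, so no extra communication is needed to agree on it.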




Figure 4: The active_t protocol - no-failure regime

The motivation for this protocol is to choose witness sets significantly smaller than 2t + 1. As a
result, assuring both safety and availability is a problem: Since t of the witnesses could be faulty,
for availability one might want to wait for only k - t replies, but usually k - t < 0. Therefore, to
guarantee availability of a witness set for every message, active_t incorporates the 3T protocol as a
recovery regime. This is done as follows: For a process p_i to send a message m, it attempts to obtain
signed acknowledgments from W_active(m). We name this the no-failure regime. After a timeout
period, if p_i has not obtained acknowledgments from all the members of W_active(m), p_i reverts to
the 3T protocol and re-sends m to W_3T(m) to obtain signed acknowledgments from a subset of
2t + 1 processes. This is called the recovery regime.

The integration of the two protocols can potentially create an opportunity for a faulty process
p_i to obtain signed acknowledgments for conflicting messages. Specifically, process p_i could first
obtain acknowledgements from the small number of witnesses in W_active(m), then select a different
set of 2t + 1 processes to act as witnesses in the recovery regime. To reduce the possibility
of delivering conflicting messages when combining the two regimes, we provide two measures: First,
we turn the witnesses of the no-failure regime into active participants. The (correct) members
of W_active(m) each probe δ randomly chosen peer processes in W_3T(m) before acknowledging a
message. Since the peers are chosen by correct processes during protocol execution, (a correct)
one is likely to be among any 2t + 1 processes chosen to act as witnesses in the recovery regime.
Figure 4 depicts the no-failure regime of the active_t protocol.

Secondly, we stipulate that any correct process that receives (signed) conflicting messages
immediately alerts the entire system. To guarantee that alerting the system will prevent conflicting
messages from being delivered, in the recovery regime we force a delay before sending an
acknowledgement. By the assumptions of our model, such a delay guarantees, with high probability,
that any pending alert message will arrive at all correct processes. In practice, this delay can be
reasonably small, e.g., by securing certain bandwidth for control messages and allowing out-of-band
delivery of urgent communication.

In order to allow witnesses to probe their peers on behalf of sender(m), we require every process
to sign its own "acknowledgement-seeking" (regular) messages. The peer processes record the
message and do not reply if it conflicts with a previous message. Hence, knowledge of the message
m propagates randomly among correct processes, without incurring additional signature overhead.
In this way, if a message m' conflicting with m has been sent to a set $S \subseteq W_{3T}(m')$ (= W_3T(m)),
then with high probability the peers chosen on behalf of m intersect S at a correct process. More
precisely, a recovery set containing 2t + 1 of the 3t + 1 processes in W_3T(m) has at least t + 1 correct
members, so the probability that δ random probes cross a correct member of it
is at least $1 - \left(\frac{2t}{3t+1}\right)^{\delta} > 1 - \left(\frac{2}{3}\right)^{\delta}$. Therefore, the parameter δ can be chosen to
achieve any desired level of probabilistic guarantee.
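
The number of probes needed for a given guarantee can be estimated with a few lines of Python. The sketch assumes the bound derived from the counting above: a recovery set of 2t + 1 processes out of 3t + 1 contains at least t + 1 correct members, so a single uniformly random probe misses all of them with probability at most 2t/(3t + 1) < 2/3. The function names are hypothetical.

```python
def probe_miss_bound(t: int, delta: int) -> float:
    """Upper bound on the chance that delta independent probes all miss the
    correct members of a (2t + 1)-member recovery set."""
    return (2 * t / (3 * t + 1)) ** delta


def min_probes(epsilon: float) -> int:
    """Smallest delta with (2/3)**delta <= epsilon, independent of t and n."""
    delta = 0
    while (2 / 3) ** delta > epsilon:
        delta += 1
    return delta


delta = min_probes(0.01)
print(delta)                       # probes per witness for a 1% miss bound
print(probe_miss_bound(3, delta))  # tighter bound for t = 3
```

Since the k witnesses probe independently, the chance that all of their probes miss is smaller still; the figure above is the bound for a single witness.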

The details of the protocol are given in Figure 5.



1. For a process p_i to WAN-multicast(m) (such that sender(m) = p_i), where p_i has previously
sent messages up to sequence number seq(m) - 1, p_i sends

<AV, regular, p_i, seq(m), H(m), sign>

to each p_j ∈ W_active(m), where sign = <p_i, seq(m), H(m)>_{K_i}. It then waits to obtain the set
of k acknowledgments A = {<AV, ack, p_i, seq(m), H(m), sign>_{K_j} | p_j ∈ W_active(m)}.

If a timeout period has passed and p_i has not obtained acknowledgments from all processes in
W_active(m), then p_i sends

<3T, regular, p_i, seq(m), H(m)>

to W_3T(m), and waits to obtain A = {<3T, ack, p_i, seq(m), H(m)>_{K_j} | p_j ∈ P'}, a set of signed
acknowledgments from any subset P' of W_3T(m) of 2t + 1 distinct processes.

In either case, p_i then sends to P

<AV, deliver, m, A>.

2. When p_i receives a message <AV, regular, p_j, cnt, h, sign>, where sign is a valid signature of
p_j on <p_j, cnt, h>, it performs the active phase of secure message transmission: If no conflicting
message was previously received, p_i randomly selects δ target processes in W_3T(p_j, cnt),
denoted peers_i. It sends

<AV, inform, p_j, cnt, h, sign>

to every p_k ∈ peers_i to obtain a message <AV, verify, p_j, cnt, h> from p_k. Upon receiving all
δ verifications, it then sends to p_j a signed acknowledgment

<AV, ack, p_j, cnt, h, sign>_{K_i}.

Note that p_i does not send back to p_j any information about peers_i.

3. When p_i receives a message <AV, inform, p_j, cnt, h, sign> from p_k, where sign is a valid
signature of p_j on <p_j, cnt, h>, such that no conflicting message was previously received, it sends
to p_k

<AV, verify, p_j, cnt, h>.

4. When p_i receives a message <3T, regular, p_j, cnt, h> from p_j, such that no conflicting message
was previously received, it delays for a pre-determined timeout period and then sends to p_j
a signed acknowledgment

<3T, ack, p_j, cnt, h>_{K_i}.

5. When p_i receives <AV, deliver, m, A>, such that A contains a valid set of AV-acknowledgments
from every member in W_active(m), or a valid set of 2t + 1 3T-acknowledgments from W_3T(m),
and such that delivery_i[sender(m)] = seq(m) - 1, p_i performs WAN-deliver(m) and sets
delivery_i[sender(m)] to seq(m). If a timeout period has passed and p_j is not known to have
delivered a message whose sequence number is seq(m) from sender(m), p_i sends <AV, deliver,
m, A> to p_j.



Figure 5: The active_t protocol
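
As an illustration, the delivery test of step 5 can be sketched in Python. This is a minimal sketch under stated assumptions, not part of the protocol specification: the class and method names are hypothetical, signature verification is assumed to have happened before the sets of signers are passed in, and sequence numbers are assumed to start at 1 so that an unseen sender has delivery counter 0.

```python
from dataclasses import dataclass, field


@dataclass
class Receiver:
    """Hypothetical model of step 5 at a correct process p_i."""
    t: int                                        # assumed fault threshold
    delivery: dict = field(default_factory=dict)  # sender -> last delivered seq
    delivered: list = field(default_factory=list)

    def acks_valid(self, av_signers, t3_signers, w_active, w_3t):
        # No-failure regime: every member of W_active(m) acknowledged.
        if set(w_active) <= set(av_signers):
            return True
        # Recovery regime: 2t + 1 distinct acknowledgers from W_3T(m).
        return len(set(t3_signers) & set(w_3t)) >= 2 * self.t + 1

    def on_deliver(self, sender, seq, msg, av_signers, t3_signers, w_active, w_3t):
        # Deliver only in sequence order, and only with a valid set A.
        in_order = self.delivery.get(sender, 0) == seq - 1
        if in_order and self.acks_valid(av_signers, t3_signers, w_active, w_3t):
            self.delivered.append(msg)
            self.delivery[sender] = seq
            return True
        return False
```

The in-order check is what suppresses duplicate deliveries: once delivery[sender] has advanced to seq, a second <AV, deliver, m, A> for the same sequence number is ignored.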

Throughout the protocol, if p_i receives conflicting messages m and m' properly signed by sender
p_j, p_i immediately sends all processes an alert message containing m and m', using the fastest
communication channels available to it. The alert message identifies without doubt a failure in p_j
due to the signatures on m and m'. Once p_j is known to have failed, all correct processes avoid message
exchange with p_j. Typically, a malicious sender may be deterred from sending conflicting messages
as their presence would unquestionably implicate it.
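
The conflict rule above amounts to remembering the first hash seen for each (sender, sequence number) pair. The following sketch is illustrative only: the class name is hypothetical, and it assumes that the sender's signature on each observed message has already been verified, so that a returned pair of digests can serve as the evidence carried by an alert message.

```python
class ConflictTracker:
    """Hypothetical bookkeeping for detecting conflicting signed messages."""

    def __init__(self):
        # (sender, seq) -> hash of the first properly signed message seen
        self.seen = {}

    def observe(self, sender, seq, digest):
        """Return None if consistent, else the two conflicting digests."""
        first = self.seen.setdefault((sender, seq), digest)
        if first != digest:
            return (first, digest)  # evidence to attach to an alert
        return None
```

A process would call observe on every regular, inform, and deliver message it accepts, and broadcast an alert as soon as the call returns a pair.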

Analysis 

The active_t protocol aims to maximize performance in faultless cases by minimizing the number
of signed messages and the number of overall message exchanges. Stress is placed on minimizing 
digital signatures (to effectively a constant number per message), since the cost of producing digital 
signatures in software is at least one order of magnitude higher than message-sending, for typical 
message sizes. 

The overhead in forming agreement on message contents in runs without failures or premature
timeouts is κ signature generations and κ message exchanges for collecting W_active acknowledgments
and δ × κ authenticated message exchanges with peers. We note that all of the overhead messages
are small (containing fixed size hashes, signatures, and the like), signatures may be computed 
concurrently at all of the witnesses, and likewise, all pairs of message exchanges with peers may be 
done concurrently. 

The overhead in case of failures can reach, in the worst case scenario, κ + 3t + 1 signatures
and message exchanges with witnesses of both the no-failure regime and the recovery regime, and
additionally, δ × κ authenticated message exchanges between witnesses and their peers. In addition,
the recovery regime incurs a delay on acknowledgement sending to allow for possibly pending alert
messages to reach their destination.
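
The counts quoted in the two preceding paragraphs can be tabulated for concrete parameters. The helper below is a hypothetical convenience, not part of the protocol; it only restates the figures κ, δ × κ, and κ + 3t + 1 from the text.

```python
def agreement_overhead(kappa, delta, t):
    """Per-message overhead counts quoted in the analysis (illustrative)."""
    failure_free = {
        "signatures": kappa,
        "witness_exchanges": kappa,
        "peer_exchanges": delta * kappa,
    }
    worst_case = {
        # Both regimes are exercised: W_active plus all of W_3T.
        "signatures": kappa + 3 * t + 1,
        "witness_exchanges": kappa + 3 * t + 1,
        "peer_exchanges": delta * kappa,
    }
    return failure_free, worst_case
```

For instance, agreement_overhead(3, 5, 10) reports 3 signatures in a failure-free run and 34 in the worst case, with 15 peer exchanges in both.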

The level of guarantee achieved by active_t depends on the parameters κ and δ. If the system
contains as many as ⌊(n − 1)/3⌋ faulty processes that know about each other (the worst case
scenario), then one out of 3^κ messages, on average, will have a completely faulty W_active set. Since
W_active(m) is a function of sender(m) and seq(m), whenever W_active(m) is completely faulty,
sender(m) has the opportunity to collude with the faulty witnesses, and convince correct
processes to WAN-deliver conflicting messages. Moreover, since the W_active function is known to all
participants, once the faulty processes and the W_active function are determined, the adversary can
predict the sequence number and sender of messages for which it can so collude and cause conflicting
WAN-deliver events. Nonetheless, this is the case for only an exponentially small fraction of the
messages that are sent. By proper choice of the parameter κ, and given that messages are multicast
in sequence order, the likelihood of such a message occurring in the lifetime of the system can
be made appropriately small.
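
One way to make "proper choice of κ" concrete is to bound the expected number of messages, over the system lifetime, whose W_active set is completely faulty. The sketch below assumes the worst-case rate of one message in 3^κ stated above; the function names and the expected-count criterion are illustrative assumptions rather than a procedure given in the paper.

```python
from fractions import Fraction


def faulty_set_rate(kappa):
    """Worst-case fraction of messages with a completely faulty W_active."""
    return Fraction(1, 3 ** kappa)


def min_kappa(num_messages, max_expected_bad):
    """Smallest kappa with num_messages / 3**kappa <= max_expected_bad."""
    limit = Fraction(max_expected_bad)
    kappa = 1
    while num_messages * faulty_set_rate(kappa) > limit:
        kappa += 1
    return kappa
```

Because the rate decays geometrically, κ grows only logarithmically with the number of messages: a million messages with at most one expected bad witness set requires κ = 13.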

There is also a chance of obtaining acknowledgments signed by correct members for conflicting
messages by having non-intersecting sets of correct processes participate in the two protocol regimes,
as follows: A faulty process p_i could generate conflicting messages m, m', sent to W_active(m) and
S respectively, where S ⊆ W_3T(m'), |S| = 2t + 1 and S ∩ W_active(m) = ∅. However, if W_active(m)
contains at least one correct member p_h, then the probability that peers_h does not intersect S at a
correct member is at most the δ-dependent bound derived in Case 3 of the proof of Theorem 5.4, which
can be made as small as desired by choosing δ appropriately.

For example, in a network of 100 processes, and assuming the number of faulty processes t < 10,
choosing κ = 3, δ = 5 will guarantee that conflicting messages are detected with probability at
least 0.95, whereas in a network of 1000 processes with t < 100, we can achieve 0.998 guarantee
level with κ = 4, δ = 10.

Proof of Correctness 

We now proceed to prove Integrity, Self-delivery, Reliability and Probabilistic Agreement of the
active_t protocol. We note that due to the possibility of a completely faulty W_active set, we cannot
always derive the correctness of the active_t protocol from that of the E protocol, as we did for the
3T protocol.

We begin with a statement of several useful properties of the protocol on which its security
rests. We note that in any run of the active_t protocol, no correct process multicasts conflicting
messages and no correct process signs conflicting acknowledgements (3T or AV). In addition, we
have the following properties:

Lemma 5.1 In any run of the active_t protocol, the following hold:

1. If process p is correct in a run, then:

(a) a correct process signs a 3T acknowledgement for a message m with sender(m) = p only
if p multicasts m, and

(b) a valid signed AV acknowledgement for a message m with sender(m) = p can be formed
only if p multicasts m.

2. If process p is correct in a run, and the run contains a set of valid 3T or AV acknowledgements
for m with sender(m) = p, then p WAN-multicast message m in the run.

Proof: A correct process q signs a 3T acknowledgement for m only when it receives m over an
authenticated channel from sender(m). Likewise, a correct process signs an AV acknowledgement
only when the AV message contains a valid signature of sender(m). The first property then follows
from the fact that a signed acknowledgement of the form <3T, ack, p, cnt, H(m)>_{K_q} contains the
identity of the sender p = sender(m), and a signed acknowledgement <AV, ack, p, cnt, H(m), sign>_{K_q}
contains both the sender's identity and its signature.

For the second property, note that if the run contains a valid set of 3T acknowledgements for
m, at least one of which must be from a correct process, then by item 1(a) of the lemma m was
multicast by p. Likewise, if a valid set of AV acknowledgements is formed for m, then again by
item 1(b) of the lemma m was multicast by p. □



Theorem 5.1 (Integrity) Let p_i be a correct process participating in the active_t protocol. Then p_i
executes WAN-deliver(m) at most once, and if sender(m) is correct, then only if sender(m) executed
WAN-multicast(m).

Proof: That p_i suppresses duplicate deliveries is immediate from the protocol. It is left to show
that if sender(m) is correct, it must have sent m. For p_i to deliver m, p_i must have obtained a
valid set of either AV acknowledgments or of 3T acknowledgements for m. By Lemma 5.1(2), m
was WAN-multicast by the correct sender(m). □



Theorem 5.2 (Self-delivery) Let p_i be a correct process participating in the active_t protocol. If p_i
executes WAN-multicast(m) then p_i executes WAN-deliver(m).

Proof : The theorem easily follows from the Self-delivery property of the 3T protocol, which is 
employed within some timeout from WAN-multicast(m) unless a valid set of AV acknowledgements 
for m is received first, enabling its delivery by pi. □ 



Theorem 5.3 (Reliability) Let p_i and p_j be two correct processes participating in the active_t protocol.
If p_i performs WAN-deliver(m), such that seq(m) = seq and sender(m) = p_k, then p_j performs
WAN-deliver(m') such that seq(m') = seq and sender(m') = p_k.

Proof: For p_i to deliver m, p_i must have obtained a valid set A of either AV acknowledgments
or of 3T acknowledgements for m. If p_i learns that p_j delivered m' satisfying seq(m') = seq and
sender(m') = p_k, then by SM_Integrity we are done. Alternatively, if p_i does not learn that p_j
delivered such m', then after a timeout period p_i sends <AV, deliver, m, A> to p_j. If p_j has not
delivered any conflicting message, then upon receipt of <AV, deliver, m, A> from p_i, process p_j
performs WAN-deliver(m). Otherwise, p_j delivers some message m' satisfying seq(m') = seq and
sender(m') = p_k. In either case, we are done. (We note that, if sender(m) is correct, then by
Integrity m = m'.) □






Note that unlike the E and 3T protocols, in the active_t protocol two correct processes are only
guaranteed to deliver the same sequenced message, and not necessarily the same message. This 
follows because the protocol only satisfies the Probabilistic Agreement property, which allows, with 
some small probability, the delivery of conflicting messages by different processes. 

We now prove that active_t maintains Probabilistic Agreement:

Theorem 5.4 (Probabilistic Agreement) Let p_i and p_j be two correct processes participating in the
active_t protocol. Let p_i deliver a message m and p_j deliver m', such that sender(m) = sender(m') and
seq(m) = seq(m'). Then the probability that p_i and p_j delivered conflicting messages, i.e., m ≠ m',
is at most [bound garbled in source; the proof below bounds it by the sum of a term exponential in κ
and a term exponential in δ].

Proof: As argued in Theorem 3.5 above, for p_i and p_j to deliver m and m', respectively, they must
have each delivered corresponding sets of valid acknowledgments A and A'. Denote by witness(m)
the set of processes represented in A, and likewise witness(m'). If witness(m) and witness(m')
intersect in a correct process, then we argue that m = m' as in that theorem. It remains to
compute the probability that conflicting message delivery is enabled in the case that witness(m)
does not intersect witness(m') at any correct member.

Case 1: witness(m) = witness(m') = W_active(m). Thus, W_active(m) contains faulty members only.
By assumption, the adversary chooses which processes are faulty without knowledge of R,
and hence for any m, W_active(m) = R(sender(m), seq(m)) randomizes the choice of processes
as a function of <sender(m), seq(m)> independently of failures. Hence, the probability $P_\kappa$ of
this event satisfies $P_\kappa \le (t/n)^{\kappa} < (1/3)^{\kappa}$.

Case 2: witness(m) ≠ W_active(m), witness(m') ≠ W_active(m'). Note that W_3T(m) = W_3T(m'),
and in this case, witness(m), witness(m') ⊆ W_3T(m) must intersect in a correct process,
leading to a contradiction.

Case 3: (W.l.o.g.) witness(m) = W_active(m), witness(m') ⊆ W_3T(m'), |witness(m')| = 2t + 1.
To distinguish from Case 1, assume that witness(m) contains at least one correct member
p_h. Note that the correct member p_h chooses peers randomly, and does not disclose the
composition of peers_h to sender(m'). Moreover, by assumption there is a positive probability
for each message from sender(m') to W_3T(m') to reach its destination (independent of the
choice of peers_h). Thus, the choice of peers_h is independent of the choice of any process
in witness(m'). Therefore, the probability that p_h does not reach any correct member in
witness(m') in δ probes is at most [bound garbled in source].

Thus, the overall probability for conflicting messages to be deliverable is bounded by the sum of
the bounds of Case 1 and Case 3.

Obviously, the probability above can be made as small as desired by appropriate choice of κ and δ,
for appropriate system sizes (i.e., such that n − t > κδ).

Here, we take the probability that an alerting message reaches correct processes in time to be exactly 1. By 
assumption, this probability approximates 1 as closely as desired by appropriate tuning of delays. 
Optimizations



It is possible to improve the fault tolerance of the active_t protocol family by allowing any subset of
κ − C witnesses out of the designated W_active set, where C is some constant, to validate a message.
Unfortunately, while this improves resilience to benign failures, it increases the probability of a
faulty witness set. Nevertheless, suppose that t = ⌊(n − 1)/3⌋. The probability P_{κ,C} of a faulty set
of κ − C out of κ randomly chosen processes is bounded by:



[Displayed bound on P_{κ,C} garbled in source.]



This probability tends to zero if we choose C < κ. Therefore, this allows us to increase the
fault tolerance while preserving safety to any desirable degree. 
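
The effect of C on safety can be explored numerically. The sketch below does not reproduce the bound displayed above; it is an independent estimate that assumes each of the κ witnesses is faulty independently with probability 1/3 (an approximation of sampling from a population with t = ⌊(n − 1)/3⌋ faults), and computes the binomial probability that at least κ − C of them are faulty. The function name and the independence assumption are illustrative.

```python
from fractions import Fraction
from math import comb


def faulty_quorum_probability(kappa, c, p_faulty=Fraction(1, 3)):
    """P[at least kappa - c of kappa independent draws are faulty].

    j counts the correct witnesses, so j <= c leaves at least
    kappa - c faulty ones. Independence is an assumption of this sketch.
    """
    p_correct = 1 - p_faulty
    return sum(
        comb(kappa, j) * p_faulty ** (kappa - j) * p_correct ** j
        for j in range(c + 1)
    )
```

With C = 0 this reduces to (1/3)^κ, the case of a completely faulty witness set; increasing C with κ fixed raises the probability, which is the trade-off described above.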

Similar improvement can be made by accommodating failures in the peer sets designated by 
processes in the active probing phase. The details of the error probabilities induced by such 
optimizations can be easily worked out, similarly to the error probability above. 



6 Load 

Our protocols were designed to bring down the cost of forming agreement on message delivery. This
was done by reducing the size of witness sets used in our protocols. A related measure of protocol
efficiency is the load it incurs over participating processes, where by load we mean the expected
maximum number of times any server is accessed per message. To compute load, we need to grow
a set M of randomly selected messages to infinity, and examine the number of accesses at the
busiest server divided by |M|. (This definition is motivated by Naor and Wool [15], adapting their
definition of load to our case where distinct messages have different witness ranges.) We remark
that this definition does not distinguish between the accesses requiring a server to sign messages
and ones requiring it only to respond.

We first look at the expected access probability of the busiest server in runs without failures or
premature timeouts. For the 3T protocol, the witness sets of messages are subsets of 2t + 1 processes
(each chosen out of a designated range W_3T of 3t + 1). If the W_3T(m) function randomizes the
choice of processes and likewise, within every witness range 2t + 1 processes are selected randomly,
then as the number of (randomly selected) messages grows, the failure-free load on the busiest
server tends to (2t + 1)/n.

In the active_t protocol with parameters κ and δ, a message m's delivery involves accessing (in
runs without failures or premature timeouts) a set W_active(m) of κ witnesses and a choice of κ × δ
peer processes. The choice of W_active(m) and peers is randomized, giving uniform probabilities
for each process to be accessed when taken at the limit, as the set of messages goes to infinity.
Therefore, the failure-free load of the active_t protocol is κ(δ + 1)/n.

If failures occur in the 3T protocol, then the load incurred is bounded by (3t + 1)/n. This
might be acceptable if t ≪ n. In the active_t protocol, failures may prevent access to W_active(m)
or to some peers and require accessing some subset of W_3T(m) processes for recovery. The load of
active_t in case of failures is bounded by (κ(δ + 1) + (3t + 1))/n.
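
The four load expressions of this section can be compared side by side for given parameters. The helpers below are hypothetical names that simply evaluate the formulas stated in the text, using exact fractions.

```python
from fractions import Fraction


def load_3t(n, t, failures=False):
    """Load of the 3T protocol: (2t + 1)/n, or at most (3t + 1)/n with failures."""
    accessed = 3 * t + 1 if failures else 2 * t + 1
    return Fraction(accessed, n)


def load_active(n, t, kappa, delta, failures=False):
    """Load of active_t: kappa(delta + 1)/n, plus (3t + 1)/n with failures."""
    accessed = kappa * (delta + 1)
    if failures:
        accessed += 3 * t + 1  # recovery may touch all of W_3T(m)
    return Fraction(accessed, n)
```

With the illustrative values n = 100, t = 10, κ = 3, δ = 5, the failure-free loads are 21/100 for 3T and 18/100 for active_t; the advantage of active_t grows with t, since its failure-free load does not depend on t.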
7 Conclusions



Experience in constructing robust distributed systems [16], [7], [Tl], [H] shows that (secure) reliable
multicast is an important tool for distributed applications. Implementing reliable multicast in an 
insecure environment with arbitrary failures incurs inevitable overhead required for maintaining 
consistency. However, a price that may be acceptable in a small network becomes intolerable for a 
very large system. 

In this paper, we have shown two approaches in which the requirements on the system may 
be weakened in order to allow for more efficient implementations of reliable multicast: The first is 
suitable for environments in which failures are rare, and where therefore, it is reasonable to assume 
a low threshold on the number of failures. The second relaxes the consistency requirement to allow 
an exponentially small fraction of the messages to be delivered inconsistently. This approach is 
practical when reversing the effects of (a small number of) bad message deliveries is possible. In 
both cases, we have devised protocols that meet the requirements and incur costs that do not grow 
with the system size, in normal faultless scenarios. 